Abstract

Given a metric space $(X, d_X)$, $c \ge 1$, $r > 0$, and $p, q \in [0, 1]$, a distribution over mappings $\mathcal{H} : X \to \mathbb{N}$
is called a $(r, cr, p, q)$-sensitive hash family if any two points in $X$ at distance at most $r$ are mapped by $\mathcal{H}$
to the same value with probability at least $p$, and any two points at distance greater than $cr$ are mapped
by $\mathcal{H}$ to the same value with probability at most $q$. This notion was introduced by Indyk and Motwani
in 1998 as the basis for an efficient approximate nearest neighbor search algorithm, and has since been
used extensively for this purpose. The performance of these algorithms is governed by the parameter
$\rho = \frac{\log(1/p)}{\log(1/q)}$, and constructing hash families with small $\rho$ automatically yields improved nearest neighbor
algorithms. Here we show that for $X = \ell_1$ it is impossible to achieve $\rho \le \frac{1}{2c}$. This almost matches the
construction of Indyk and Motwani which achieves $\rho \le \frac{1}{c}$.

1 Introduction 

In this note we study the complexity of finding the nearest neighbor of a query point in certain high
dimensional spaces using Locality Sensitive Hashing (LSH). The nearest neighbor problem is formulated as
follows: Given a database of n points in a metric space, preprocess it so that given a new query point it is 
possible to quickly find the point closest to it in the data set. This fundamental problem arises in numerous 
applications, including data mining, information retrieval, and image search, where distinctive features of 
the objects are represented as points in $\mathbb{R}^d$. There is a vast amount of literature on this topic, and we shall
not attempt to discuss it here. We refer the interested reader to the papers iDEHUEl, and especially to the 
references therein, for background on the nearest neighbor problem. 

While the exact nearest neighbor problem seems to suffer from the "curse of dimensionality", many 
efficient techniques have been devised for finding an approximate solution whose distance from the query 
point is at most c times its distance from the nearest neighbor. One of the most versatile and efficient 
methods for approximate nearest neighbor search is based on Locality Sensitive Hashing, as introduced 
by Indyk and Motwani in 1998 [6]. This method has been refined and improved in several papers; the
most recent algorithm can be found in [1]. We also refer the reader to the LSH website, where more
information on this algorithm can be found, including its implementation and code:
http://web.mit.edu/andoni/www/LSH/index.html

The LSH approach to the approximate nearest neighbor problem is based on the following concept.

Definition 1.1. Let $(X, d_X)$ be a metric space, $r, R > 0$ and $p, q \in [0, 1]$. A distribution over mappings
$\mathcal{H} : X \to \mathbb{N}$ is called a $(r, R, p, q)$-sensitive hash family if for any $x, y \in X$,

• $d_X(x, y) \le r \implies \Pr[\mathcal{H}(x) = \mathcal{H}(y)] \ge p$.

• $d_X(x, y) > R \implies \Pr[\mathcal{H}(x) = \mathcal{H}(y)] \le q$.
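
As a concrete instance of Definition 1.1, the following minimal sketch uses the bit-sampling family on the Hamming cube, which hashes a point to one uniformly random coordinate. The function names are illustrative assumptions and do not come from the paper. For this family the collision probability of two points at Hamming distance $t$ is exactly $1 - t/d$, so the best admissible $p$ and $q$ can be read off directly.

```python
import math
from itertools import product

def hamming(x, y):
    """Hamming (l_1) distance between two 0/1 tuples of equal length."""
    return sum(a != b for a, b in zip(x, y))

def bit_sampling_collision(x, y):
    """Exact Pr[h(x) = h(y)] when h(u) = u[i] for a uniformly random coordinate i."""
    d = len(x)
    return sum(x[i] == y[i] for i in range(d)) / d

def sensitivity_parameters(d, r, R):
    """Best (p, q) making bit sampling on {0,1}^d an (r, R, p, q)-sensitive family.

    The collision probability is 1 - dist/d, so the worst close pair sits at
    distance r and the worst far pair at distance floor(R) + 1.
    """
    p = 1 - r / d
    q = 1 - (math.floor(R) + 1) / d
    return p, q

def brute_force_parameters(d, r, R):
    """Same quantities by enumerating all pairs (only feasible for tiny d, needs R < d)."""
    points = list(product((0, 1), repeat=d))
    close = [bit_sampling_collision(x, y)
             for x in points for y in points if hamming(x, y) <= r]
    far = [bit_sampling_collision(x, y)
           for x in points for y in points if hamming(x, y) > R]
    return min(close), max(far)

def rho(p, q):
    """The exponent log(1/p) / log(1/q)."""
    return math.log(1 / p) / math.log(1 / q)

p, q = sensitivity_parameters(d=1000, r=10, R=20)
print(p, q, rho(p, q))
```

With $d = 1000$, $r = 10$ and $R = cr = 20$ the printed exponent stays below $1/c = 1/2$.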
Given $c \ge 1$ we define

\[ \rho_X(c) = \sup_{r > 0} \inf \left\{ \frac{\log(1/p)}{\log(1/q)} : \exists\, (r, cr, p, q)\text{-sensitive hash family } \mathcal{H} : X \to \mathbb{N} \right\}. \tag{1} \]

Of particular interest is the case $X = \ell_s^d$, for some $s \ge 1$ and $d \in \mathbb{N}$. In this case we define

\[ \rho_s(c) = \limsup_{d \to \infty} \rho_{\ell_s^d}(c). \]

The importance of these parameters stems from the following application to approximate nearest neighbor
search. It will be convenient to discuss it in the framework of the following decision version of the
c-approximate nearest neighbor problem: Given a query point, find any element of the data set which is at 
distance at most cr from it, provided that there is a data point at distance at most r from the query point. 
This decision version is known as the (r, cr)-near neighbor problem. It is well known that the reduction to 
the decision version adds only a logarithmic factor in the time and space complexity 0E1. The following 
theorem was proved in [6]; the exact formulation presented here is taken from

Theorem 1.2. Let $(X, d_X)$ be a metric on a subset of $\mathbb{R}^d$. Suppose that $(X, d_X)$ admits a $(r, cr, p, q)$-sensitive
hash family $\mathcal{H}$, and write $\rho = \frac{\log(1/p)}{\log(1/q)}$. Then for any $n \ge \frac{1}{q}$ there exists a randomized algorithm for $(r, cr)$-near
neighbor on $n$-point subsets of $X$ which uses $O(dn + n^{1+\rho})$ space, with query time dominated by $O(n^{\rho})$
distance computations and $O(n^{\rho} \log_{1/q} n)$ evaluations of hash functions from $\mathcal{H}$.
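
To get a feel for how $\rho$ drives the costs in Theorem 1.2, the sketch below evaluates the leading-order terms for given $n$, $d$, $p$, $q$. The helper name `lsh_costs` is hypothetical, and the constants hidden by the $O(\cdot)$ notation are simply set to 1, so the numbers are only indicative.

```python
import math

def lsh_costs(n, d, p, q):
    """Leading-order quantities from Theorem 1.2 with all hidden constants set to 1."""
    if not (0 < q < p < 1):
        raise ValueError("expected 0 < q < p < 1")
    rho = math.log(1 / p) / math.log(1 / q)
    return {
        "rho": rho,
        "space": d * n + n ** (1 + rho),
        "distance_computations": n ** rho,
        # n^rho * log_{1/q}(n), written with natural logarithms
        "hash_evaluations": n ** rho * math.log(n) / math.log(1 / q),
    }

print(lsh_costs(n=10_000, d=10, p=0.5, q=0.25))
```

Here $p = 1/2$ and $q = 1/4$ give $\rho = 1/2$, so a query touches on the order of $\sqrt{n}$ points rather than all $n$.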

Thus, obtaining bounds on $\rho_X(c)$ is of great algorithmic interest. It is proved in [6] that $\rho_1(c) \le 1/c$,
and for small values of $c$, namely $c \in [1, 10]$, it was shown in [4] that this inequality is strict. We refer
to [4] for numerical data on the best known estimates for $\rho_1(c)$ for small $c$. For $s = 2$ a recent result of
Andoni and Indyk [1] shows that $\rho_2(c) \le 1/c^2$, and for general $s \in [1, 2]$ the best known bounds [4] are
$\rho_s(c) \le \max\{1/c, 1/c^s\}$.

The main purpose of this note is to obtain lower bounds on $\rho_1(c)$ and $\rho_2(c)$ which nearly match the
bounds obtained from the constructions in [1, 4, 6]. Our main result is:

Theorem 1.3. For every $c, s \ge 1$,

\[ \rho_s(c) \ge \frac{e^{1/c^s} - 1}{e^{1/c^s} + 1} \ge \frac{e - 1}{e + 1} \cdot \frac{1}{c^s} \ge \frac{0.462}{c^s}. \tag{2} \]

The second to last inequality in (2) follows from concavity of the function $t \mapsto \frac{e^t - 1}{e^t + 1}$ on $[0, \infty)$. Observe
also that as $c \to \infty$, $\frac{e^{1/c^s} - 1}{e^{1/c^s} + 1} \sim \frac{1}{2c^s}$. It would be very interesting to determine $\limsup_{c \to \infty} c \cdot \rho_1(c)$ exactly; due
to Theorem 1.3 and the results of [6] we currently know that this number is in the interval $[1/2, 1]$.
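
The chain of inequalities in (2) and the asymptotic behaviour as $c \to \infty$ can be checked numerically. The sketch below is only an illustration with hypothetical function names; it uses the identity $\frac{e^t - 1}{e^t + 1} = \tanh(t/2)$.

```python
import math

def lower_bound(c, s=1.0):
    """First lower bound in (2): (e^t - 1)/(e^t + 1) with t = 1/c^s."""
    t = 1.0 / c ** s
    return math.tanh(t / 2)  # identical to (e^t - 1) / (e^t + 1)

def linear_lower_bound(c, s=1.0):
    """Second lower bound in (2): (e - 1)/(e + 1) * 1/c^s."""
    return (math.e - 1) / (math.e + 1) / c ** s

for c in (1, 2, 5, 10, 100):
    # columns: c, first bound, second bound, 0.462/c, known upper bound 1/c (s = 1)
    print(c, lower_bound(c), linear_lower_bound(c), 0.462 / c, 1 / c)
```

For large $c$ the first column of bounds approaches $\frac{1}{2c}$, which is why the gap to the upper bound $\frac{1}{c}$ is a factor of 2.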



2 Proof of Theorem 1.3

The basic idea in the proof of Theorem 1.3 is simple. Choose a random point $x \in \{0,1\}^d$ and consider the
random subset $A$ of the cube $\{0,1\}^d$ consisting of points $u$ for which $\mathcal{H}(u) = \mathcal{H}(x)$. The second condition
in Definition 1.1 forces $A$ to be small in expectation. But, when $A$ is small we can bound from above the
probability that after $r$ steps, the random walk starting at a random point in $A$ will end up in $A$. We obtain this
upper bound using a Fourier analytic argument, and in combination with the first condition in Definition 1.1
we deduce the desired bound on $\rho_1(c)$.

Theorem 1.3 follows from the following result:
Proposition 2.1. Let $\mathcal{H}$ be a $(r, R, p, q)$-sensitive hash family on the Hamming cube $(\{0,1\}^d, \|\cdot\|_1)$. Assume
that $r$ is an odd integer and that $R < \frac{d}{2}$. Then

\[ p \le \left( q + e^{-\frac{1}{d}\left(\frac{d}{2} - R\right)^2} \right)^{\frac{e^{2r/d} - 1}{e^{2r/d} + 1}}. \]

Choosing $R \approx \frac{d}{2} - \sqrt{d \log d}$ and $r \approx R/c$ in Proposition 2.1 and letting $d \to \infty$, yields Theorem 1.3 in
the case $s = 1$. The case of general $s \ge 1$ follows from the fact that for $x, y \in \{0,1\}^d$, $\|x - y\|_s = \|x - y\|_1^{1/s}$.
The proof of Proposition 2.1 will be broken into a few lemmas.
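
The effect of the parameter choice $R \approx \frac{d}{2} - \sqrt{d \log d}$, $r \approx R/c$ can be seen numerically. The sketch below rests on two assumptions of our own: $r$ is rounded down to the nearest odd integer, and $q$ is held fixed as $d$ grows. Function names are hypothetical. With this $R$ the exponential term equals $1/d$, and the exponent $\frac{e^{2r/d} - 1}{e^{2r/d} + 1} = \tanh(r/d)$ tends to $\tanh\left(\frac{1}{2c}\right)$, the bound of Theorem 1.3 for $s = 1$.

```python
import math

def choose_parameters(d, c):
    """R = d/2 - sqrt(d log d) and an odd integer r <= R/c (our rounding rule)."""
    R = d / 2 - math.sqrt(d * math.log(d))
    r = int(R / c)
    if r % 2 == 0:
        r -= 1  # Proposition 2.1 needs r odd
    if r < 1:
        raise ValueError("d is too small for this choice of parameters")
    return r, R

def proposition_exponent(d, r):
    """(e^{2r/d} - 1) / (e^{2r/d} + 1), which equals tanh(r/d)."""
    return math.tanh(r / d)

def rho_lower_bound(d, c, q):
    """Lower bound on log(1/p)/log(1/q) implied by Proposition 2.1 (needs q + 1/d < 1)."""
    r, R = choose_parameters(d, c)
    tail = math.exp(-((d / 2 - R) ** 2) / d)  # equals 1/d for this R
    return proposition_exponent(d, r) * math.log(1 / (q + tail)) / math.log(1 / q)

for d in (10**4, 10**6, 10**8):
    print(d, rho_lower_bound(d, c=2, q=0.1), math.tanh(1 / 4))
```

The printed values increase towards $\tanh(1/4) \approx 0.245$ as $d$ grows.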

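Lemma 2.2. Let $\mathcal{H}$ be a $(r, R, p, q)$-sensitive hash family on the Hamming cube $(\{0,1\}^d, \|\cdot\|_1)$, and fix
$x \in \{0,1\}^d$. Then

\[ \mathbb{E}\left|\mathcal{H}^{-1}(\mathcal{H}(x))\right| \le q \cdot 2^d + \sum_{k \le R} \binom{d}{k}. \]

Proof. We simply write

\begin{align*}
\mathbb{E}\left|\mathcal{H}^{-1}(\mathcal{H}(x))\right| &= \sum_{u \in \{0,1\}^d} \Pr[\mathcal{H}(u) = \mathcal{H}(x)] \\
&\le \left|\{u \in \{0,1\}^d : \|u - x\|_1 \le R\}\right| + q \cdot \left|\{u \in \{0,1\}^d : \|u - x\|_1 > R\}\right| \\
&= \sum_{k=0}^{\lfloor R \rfloor} \binom{d}{k} + q \sum_{k=\lfloor R \rfloor + 1}^{d} \binom{d}{k}.
\end{align*}

□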

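Corollary 2.3. Assume that $R < \frac{d}{2}$. Then, using the notation of Lemma 2.2, we have that

\[ \mathbb{E}\left|\mathcal{H}^{-1}(\mathcal{H}(x))\right| \le 2^d \left( q + e^{-\frac{1}{d}\left(\frac{d}{2} - R\right)^2} \right). \]

Proof. This follows from Lemma 2.2 and the standard estimate $\sum_{k \le \frac{d}{2} - a} \binom{d}{k} \le 2^d \cdot e^{-a^2/d}$. □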
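
The binomial tail estimate used in the proof of Corollary 2.3 can be checked by direct summation for small $d$. The following sketch uses hypothetical function names and exact integer arithmetic for the left-hand side; it checks the estimate in the form $\sum_{k \le d/2 - a} \binom{d}{k} \le 2^d e^{-a^2/d}$ for $a \ge 0$.

```python
import math

def binomial_tail(d, a):
    """Sum of C(d, k) over integers 0 <= k <= d/2 - a (empty sum if d/2 - a < 0)."""
    top = math.floor(d / 2 - a)
    return sum(math.comb(d, k) for k in range(0, top + 1))

def tail_estimate_holds(d, a):
    """Check sum_{k <= d/2 - a} C(d, k) <= 2^d * exp(-a^2 / d)."""
    return binomial_tail(d, a) <= 2 ** d * math.exp(-a * a / d)

# Scan a small grid of (d, a) pairs; keep d moderate so 2**d fits in a float.
violations = [(d, a / 2) for d in range(1, 60)
              for a in range(0, d + 1) if not tail_estimate_holds(d, a / 2)]
print("violations:", violations)
```

Taking $a = \frac{d}{2} - R$ turns this estimate into the exponential term of Corollary 2.3.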

Lemma 2.4 (Random walk lemma). Let $r$ be an odd integer. Given $\emptyset \ne B \subseteq \{0,1\}^d$, consider the random
variable $Q_B \in \{0,1\}^d$ defined as follows: Choose a point $z \in B$ uniformly at random, and perform $r$ steps of
the standard random walk on the Hamming cube starting from $z$. The point thus obtained will be denoted
$Q_B$. Then

\[ \Pr[Q_B \in B] \le \left( \frac{|B|}{2^d} \right)^{\frac{e^{2r/d} - 1}{e^{2r/d} + 1}}. \]

Proof. We begin by recalling some background and notation on Fourier analysis on the Hamming cube.
Given $S \subseteq \{1, \ldots, d\}$, the Walsh function $W_S : \{0,1\}^d \to \{-1, 1\}$ is defined by

\[ W_S(u) = (-1)^{\sum_{j \in S} u_j}. \]

For $f : \{0,1\}^d \to \mathbb{R}$ we set

\[ \hat{f}(S) = \frac{1}{2^d} \sum_{u \in \{0,1\}^d} f(u) W_S(u), \]

so that $f$ can be decomposed as follows:

\[ f = \sum_{S \subseteq \{1, \ldots, d\}} \hat{f}(S) W_S. \]



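For every $f, g : \{0,1\}^d \to \mathbb{R}$ we write

\[ \langle f, g \rangle = \frac{1}{2^d} \sum_{u \in \{0,1\}^d} f(u) g(u). \]

By Parseval's identity,

\[ \langle f, g \rangle = \sum_{S \subseteq \{1, \ldots, d\}} \hat{f}(S) \hat{g}(S). \]

For $\varepsilon \in [0, 1]$ the Bonami-Beckner operator $T_\varepsilon$ is defined as

\[ T_\varepsilon f = \sum_{S \subseteq \{1, \ldots, d\}} \varepsilon^{|S|} \hat{f}(S) W_S. \]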

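The Bonami-Beckner inequality [2, 3] states that for every $f : \{0,1\}^d \to \mathbb{R}$,

\[ \sum_{S \subseteq \{1, \ldots, d\}} \varepsilon^{2|S|} \hat{f}(S)^2 = \|T_\varepsilon f\|_2^2 = \frac{1}{2^d} \sum_{u \in \{0,1\}^d} \left( T_\varepsilon f(u) \right)^2 \le \|f\|_{1+\varepsilon^2}^2 = \left( \frac{1}{2^d} \sum_{u \in \{0,1\}^d} |f(u)|^{1+\varepsilon^2} \right)^{\frac{2}{1+\varepsilon^2}}. \]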



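Specializing to the indicator of $B \subseteq \{0,1\}^d$ we get that

\[ \sum_{S \subseteq \{1, \ldots, d\}} \varepsilon^{2|S|} \hat{\mathbf{1}}_B(S)^2 \le \left( \frac{|B|}{2^d} \right)^{\frac{2}{1+\varepsilon^2}}. \tag{3} \]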



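Now, let $P$ be the transition matrix of the standard random walk on $\{0,1\}^d$, i.e. $P_{uv} = 1/d$ if $u$ and $v$ differ
in exactly one coordinate, $P_{uv} = 0$ otherwise. By a direct computation we have that for every $S \subseteq \{1, \ldots, d\}$,

\[ P W_S = \left( 1 - \frac{2|S|}{d} \right) W_S, \]

i.e. $W_S$ is an eigenvector of $P$ with eigenvalue $1 - \frac{2|S|}{d}$. The probability that the random walk starting from
a random point in $B$ ends up in $B$ after $r$ steps equals

\begin{align*}
\frac{2^d}{|B|} \langle P^r \mathbf{1}_B, \mathbf{1}_B \rangle &= \frac{2^d}{|B|} \sum_{S \subseteq \{1, \ldots, d\}} \hat{\mathbf{1}}_B(S)^2 \left( 1 - \frac{2|S|}{d} \right)^r \\
&\le \frac{2^d}{|B|} \sum_{\substack{S \subseteq \{1, \ldots, d\} \\ |S| < d/2}} \hat{\mathbf{1}}_B(S)^2 \left( 1 - \frac{2|S|}{d} \right)^r,
\end{align*}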
d 
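where we used the fact that $r$ is odd (i.e. we dropped negative terms).
Thus, using (3) with $\varepsilon = e^{-r/d}$ we see that

\[ \Pr[Q_B \in B] \le \frac{2^d}{|B|} \sum_{S \subseteq \{1, \ldots, d\}} \hat{\mathbf{1}}_B(S)^2 \cdot e^{-2r|S|/d} \le \frac{2^d}{|B|} \left( \frac{|B|}{2^d} \right)^{\frac{2}{1 + e^{-2r/d}}} = \left( \frac{|B|}{2^d} \right)^{\frac{1 - e^{-2r/d}}{1 + e^{-2r/d}}} = \left( \frac{|B|}{2^d} \right)^{\frac{e^{2r/d} - 1}{e^{2r/d} + 1}}. \]

□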
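
Lemma 2.4 can be sanity-checked on a small cube by computing the return probability exactly. The sketch below is an illustration only: points of $\{0,1\}^d$ are encoded as integers, the function names are hypothetical, and the random subsets (with an arbitrary seed) are just sample inputs.

```python
import math
import random

def walk_return_probability(B, d, r):
    """Exact Pr[Q_B in B]: start uniformly in B, take r steps of the walk on {0,1}^d.

    B is a collection of distinct integers in [0, 2^d); one step flips a
    uniformly random bit, so each neighbor receives mass / d.
    """
    dist = [0.0] * (1 << d)
    for z in B:
        dist[z] = 1.0 / len(B)
    for _ in range(r):
        nxt = [0.0] * (1 << d)
        for u, mass in enumerate(dist):
            if mass:
                for i in range(d):
                    nxt[u ^ (1 << i)] += mass / d
        dist = nxt
    return sum(dist[u] for u in B)

def lemma_bound(size_B, d, r):
    """Right-hand side of Lemma 2.4; the exponent equals tanh(r/d)."""
    return (size_B / 2 ** d) ** math.tanh(r / d)

random.seed(0)  # arbitrary seed, only to make the run repeatable
d = 6
results = []
for r in (1, 3, 5):
    for size in (1, 4, 16, 48, 64):
        B = random.sample(range(1 << d), size)
        results.append((r, size, walk_return_probability(B, d, r), lemma_bound(size, d, r)))
for row in results:
    print(row)
```

In every row the exact return probability (third entry) should not exceed the bound (fourth entry); for a single point and odd $r$ it is 0, since an odd number of steps changes the parity of the point.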



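Proof of Proposition 2.1. Assume that $r$ is an odd integer and $R < \frac{d}{2}$. For $x \in \{0,1\}^d$ let $W_r(x) \in \{0,1\}^d$ be
the random point obtained by performing a random walk for $r$ steps starting at $x$. Since $\|x - W_r(x)\|_1 \le r$ we
know that $\Pr[\mathcal{H}(W_r(x)) = \mathcal{H}(x)] \ge p$. Taking expectation with respect to the uniform probability measure
on $\{0,1\}^d$ we deduce that

\begin{align*}
p &\le \mathbb{E}_{x \in \{0,1\}^d} \Pr[\mathcal{H}(W_r(x)) = \mathcal{H}(x)] \\
&= \mathbb{E}_{\mathcal{H}} \Pr\left[ x \in \{0,1\}^d : W_r(x) \in \mathcal{H}^{-1}(\mathcal{H}(x)) \right] \\
&= \mathbb{E}_{\mathcal{H}} \sum_{k \in \mathbb{N}} \Pr\left[ x \in \{0,1\}^d : W_r(x) \in \mathcal{H}^{-1}(\mathcal{H}(x)) \wedge \mathcal{H}(x) = k \right] \\
&= \mathbb{E}_{\mathcal{H}} \sum_{k \in \mathbb{N}} \frac{|\mathcal{H}^{-1}(k)|}{2^d} \Pr\left[ Q_{\mathcal{H}^{-1}(k)} \in \mathcal{H}^{-1}(k) \right] \\
&\le \mathbb{E}_{\mathcal{H}} \sum_{k \in \mathbb{N}} \frac{|\mathcal{H}^{-1}(k)|}{2^d} \left( \frac{|\mathcal{H}^{-1}(k)|}{2^d} \right)^{\frac{e^{2r/d} - 1}{e^{2r/d} + 1}} \tag{4} \\
&= \mathbb{E}_{\mathcal{H}} \frac{1}{2^d} \sum_{x \in \{0,1\}^d} \left( \frac{|\mathcal{H}^{-1}(\mathcal{H}(x))|}{2^d} \right)^{\frac{e^{2r/d} - 1}{e^{2r/d} + 1}} \\
&\le \left( \frac{1}{2^d} \sum_{x \in \{0,1\}^d} \frac{\mathbb{E}_{\mathcal{H}} |\mathcal{H}^{-1}(\mathcal{H}(x))|}{2^d} \right)^{\frac{e^{2r/d} - 1}{e^{2r/d} + 1}} \tag{5} \\
&\le \left( q + e^{-\frac{1}{d}\left(\frac{d}{2} - R\right)^2} \right)^{\frac{e^{2r/d} - 1}{e^{2r/d} + 1}}, \tag{6}
\end{align*}

where in (4) we used Lemma 2.4, in (5) we used Jensen's inequality, and in (6) we used Corollary 2.3. □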